ABSTRACT

Two statistical prediction models for the 24-hour surface pressure 
change are developed. One model employs the terms in a dynamic 
model as the independent variables in a linear regression equation. 
The other model combines these variables with parameters capable of
reflecting the long-wave, long-term influences in a multivariate
discriminant analysis. The regression equations were developed from
data taken from the month of November 1962 at latitude 50N. A
discussion of the results of both methods is presented along with a 
critique of the procedures used in obtaining the data. 

The writers wish to express their appreciation for the guidance and 
encouragement given them by Professor Frank L. Martin of the 
U.S. Naval Postgraduate School in this investigation. 

They are also indebted to Lieutenant Commander Mildred J. Frawley,
United States Navy, of the Fleet Numerical Weather Facility for
assistance in computer programming.

1. Introduction

Numerical techniques for the sea-level pressure prognosis have 
not shared the same success as those used for forecasting the 500-mb 
contour map. Methods presently in operational use consist in general
of a 500-mb prognosis coupled with a prognosis of thickness using some
form of the first law of thermodynamics. Methods similar in nature
have been suggested by Haltiner and Hesse [1958] and Reed [1956].
In addition, it has become standard practice with such units as the
U.S. Navy Fleet Numerical Weather Facility, Monterey, California
(FNWF), to superimpose certain empirical corrections on the location
and intensity of cyclones and anticyclones. The inability of
numerical methods to predict cyclogenesis and anticyclogenesis is
perhaps the greatest deficiency of the presently operational numerical
techniques. This study was undertaken to help eliminate some of
these problems.

Two models were developed: (1) a model using solely dynamic
parameters in the form of a multiple linear regression equation; and
(2) a stepwise multivariate regression analysis using the "dynamic"
predictors as well as certain other parameters selected to reveal
long-term and long-wave influences on the surface pressure change.
For convenience the two models are hereafter referred to as "the
dynamic model" and "the statistical model", respectively. Separate
and detailed descriptions of the two models are presented in the
following sections.

Latitude 50N in early winter months was chosen as a latitude
offering typical, if not difficult, forecast problems for the investigation.
The dependent data were taken, insofar as possible, from 30 consecutive
days beginning in late October, 1962. Early December of the same
year provided five days of independent data. Twenty-four geographical
locations, at intervals of 15 degrees of longitude around the latitude
belt, were chosen as "stations". The "stations" were arbitrarily
numbered from 1 to 24, starting at ocean station "Papa" in the Gulf of
Alaska and proceeding in an eastward direction.
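
The station layout just described can be sketched in a few lines. This
is only an illustration: the nominal position of ocean station "Papa"
(50N, 145W) is an assumption not stated in the text, and the function
names are hypothetical.

```python
def station_longitudes(first_lon_deg=-145.0, spacing_deg=15.0, count=24):
    """Longitudes (degrees east, in [-180, 180)) of stations 1..count.

    first_lon_deg = -145.0 is the assumed longitude of station 1
    (ocean station "Papa"); stations are numbered eastward.
    """
    lons = []
    for n in range(count):
        lon = first_lon_deg + spacing_deg * n
        # Wrap into [-180, 180) so the belt closes on itself.
        lons.append((lon + 180.0) % 360.0 - 180.0)
    return lons


def sample_size(n_stations, n_days=30):
    """Number of cases obtained from n_stations over n_days of data."""
    return n_stations * n_days


if __name__ == "__main__":
    for number, lon in enumerate(station_longitudes(), start=1):
        print(number, lon)
    print(sample_size(24))
```

With 30 days of dependent data, the same arithmetic gives the sample
sizes quoted for the full belt, the continental stations and the ocean
stations.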

The data used in both models were taken from numerical computations
and printout charts prepared by use of the CDC 1604 electronic
digital computer. The programs and data tapes necessary to compute
and print the charts were supplied by FNWF. All statistical
computations were made on the same computer using selected programs
from the BIMD* Fortran library.

The techniques employed in this work were not designed to supplant
the numerical methods now in use. This effort was conducted so as
to illuminate some of the factors not now considered that may be
influential in causing a sea-level pressure change. Time steps of
twenty-four hours were attempted in connection with both models in
order to test the efficacy of this longer-than-normal time step in
conjunction with statistical methods.

* A collection of statistical electronic computer programs compiled
and edited by the Biology and Medical Department of the
University of California, Los Angeles. This manual is distributed
through the UCLA campus bookstore.





2. A derivation of the dynamical prediction equation 

The following development is taken from Reed [1962], with some
modifications and simplifications in the later stages of the derivation.
Pertinent remarks will indicate where these modifications are
applicable.

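The frictionless vorticity equation for the 1000-mb level (subscript 0)
may be well approximated by

$$\frac{\partial}{\partial t}(\zeta_0 + f) = -\mathbf{V}_0\cdot\nabla(\zeta_0 + f) + f\left(\frac{\partial\omega}{\partial p}\right)_0 . \tag{1}$$

If we assume a parabolic vertical velocity profile between the
surface and 500 mb (subscript 5) of the form

$$\omega = \omega_0 + (\omega_5 - \omega_0)\left[1 - \left(\frac{p - p_5}{p_0 - p_5}\right)^2\right] \tag{2}$$

and substitute for $(\partial\omega/\partial p)_0$ in (1), we obtain

$$\frac{\partial}{\partial t}(\zeta_0 + f) = -\mathbf{V}_0\cdot\nabla(\zeta_0 + f) - \frac{2f(\omega_5 - \omega_0)}{p_0 - p_5} . \tag{3}$$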
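
The step from (2) to (3) rests on the slope of the parabolic profile at
the 1000-mb level. The following sketch checks that slope numerically.
It assumes the profile has the form
omega = omega_0 + (omega_5 - omega_0)[1 - ((p - p_5)/(p_0 - p_5))^2],
so that the derivative at p_0 is -2(omega_5 - omega_0)/(p_0 - p_5); the
function names and the sample values are illustrative only.

```python
def omega_profile(p, omega0, omega5, p0=1000.0, p5=500.0):
    """Assumed parabolic vertical-velocity profile between p5 and p0."""
    s = (p - p5) / (p0 - p5)
    return omega0 + (omega5 - omega0) * (1.0 - s ** 2)


def slope_at_p0_numeric(omega0, omega5, p0=1000.0, p5=500.0, dp=1.0e-3):
    """Centered finite-difference estimate of d(omega)/dp at p = p0."""
    upper = omega_profile(p0 + dp, omega0, omega5, p0, p5)
    lower = omega_profile(p0 - dp, omega0, omega5, p0, p5)
    return (upper - lower) / (2.0 * dp)


def slope_at_p0_closed_form(omega0, omega5, p0=1000.0, p5=500.0):
    """Closed-form slope used in passing from (1) to (3)."""
    return -2.0 * (omega5 - omega0) / (p0 - p5)


if __name__ == "__main__":
    # Illustrative values in mb/s; not taken from the data of this study.
    w0, w5 = 0.5e-3, -2.0e-3
    print(slope_at_p0_numeric(w0, w5), slope_at_p0_closed_form(w0, w5))
```

The profile also reproduces the two boundary values, omega_5 at 500 mb
and omega_0 at 1000 mb, which is what fixes the coefficient 2 in (3).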


The geostrophic wind is used to approximate the vorticity and,
following Reed, the 1000-mb contour pattern may be regarded as
consisting of a set of equally-spaced circular highs and lows of the
form

$$Z_0 = -\frac{fU}{g}\,y + B\sin\frac{2\pi x}{L}\,\sin\frac{2\pi y}{L} \tag{4}$$

superimposed on a constant zonal current U. Here x and y are the
eastward and northward distance elements, respectively, and the
absolute constant resulting from the integration of U along the meridian
is taken as zero. Then the 1000-mb relative vorticity becomes

$$\zeta_0 = \frac{g}{f}\nabla^2 Z_0 = -\frac{8\pi^2 g}{fL^2}\left[Z_0 + \frac{fU}{g}\,y\right] . \tag{5}$$

Substituting (5) into (3) yields the equation

$$\frac{\partial}{\partial t}\left(-Z_0 + G' - \frac{fU}{g}\,y\right) = -\mathbf{V}_0\cdot\nabla\left(-Z_0 + G' - \frac{fU}{g}\,y\right) - \frac{2G'(\omega_5 - \omega_0)}{p_0 - p_5} , \tag{6}$$

where

$$G' = \frac{f^2L^2}{8\pi^2 g} , \qquad H' = G' - \frac{fU}{g}\,y .$$


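The particular form of the adiabatic thermodynamic equation
used here is

$$\frac{\partial}{\partial t}\left(\frac{\partial Z}{\partial p}\right) = -\mathbf{V}\cdot\nabla\left(\frac{\partial Z}{\partial p}\right) - \frac{\sigma}{g}\,\omega , \tag{7}$$

where $\sigma$ is a static-stability parameter.

With the assumption of a linear geostrophic-wind hodograph in the
layer between 1000 mb and 500 mb, equation (7) may be integrated
with respect to p from 1000 to 500 mb to give

$$\frac{\partial}{\partial t}(Z_5 - Z_0) = -\mathbf{V}_0\cdot\nabla(Z_5 - Z_0) + \frac{\bar{\sigma}(p_0 - p_5)}{3g}\,(2\omega_5 + \omega_0) . \tag{8}$$

Next multiply (8) by

$$k' = \frac{3gG'}{\bar{\sigma}(p_0 - p_5)^2} , \tag{9}$$

a slowly varying parameter which will be regarded here as constant,
and add the modified version of (8) to (6), giving

$$\frac{\partial}{\partial t}\left[-Z_0 + H' + k'(Z_5 - Z_0)\right] = -\mathbf{V}_0\cdot\nabla\left[-Z_0 + H' + k'(Z_5 - Z_0)\right] + \frac{3G'\omega_0}{p_0 - p_5} . \tag{10}$$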


Use of the kinematic boundary condition $\omega_0 = \mathbf{V}_0\cdot\nabla p_g$ allows the
vertical velocity $\omega_0$ to be included within both brackets of (10) as a
terrain effect. This last effect is not considered in this paper, since
it is not one of the dynamic factors explicitly selected for the
statistical regression employed in this study. Hence we are left with the
result

$$\frac{\partial}{\partial t}\left[-Z_0 + H' + k'(Z_5 - Z_0)\right] = -\mathbf{V}_0\cdot\nabla\left[-Z_0 + H' + k'(Z_5 - Z_0)\right] \tag{11}$$

as the prediction equation.


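Equation (11) is now rearranged to give

$$\frac{\partial}{\partial t}\left[k'Z_5 - (1 + k')Z_0 + H'\right] = -\mathbf{V}_0\cdot\nabla\left[k'Z_5 - (1 + k')Z_0 + H'\right] \tag{12}$$

or

$$\frac{\partial}{\partial t}\left[kZ_5 - Z_0 + H\right] = -\mathbf{V}_0\cdot\nabla\left[kZ_5 - Z_0 + H\right] , \qquad k = \frac{k'}{1 + k'} , \quad H = \frac{H'}{1 + k'} . \tag{13}$$

We next make use of the principle, first introduced by Fjortoft
[1952] and extended by Reed [1962], of employing an equivalent
advecting wind which gives the same instantaneous advection as $\mathbf{V}_0$
but which has the property of changing more slowly with time. This
concept has proved to be particularly valuable in graphical integration
of the dynamical equations where long time steps are employed.
Thus (13) may be written in the equivalent form

$$\frac{\partial}{\partial t}\left[kZ_5 - Z_0 + H\right] = -\mathbf{V}_e\cdot\nabla\left[kZ_5 - Z_0 + H\right] \tag{14}$$

or, alternatively, as

$$\frac{\partial Z_0}{\partial t} = \mathbf{V}_e\cdot\nabla\left[kZ_5 - Z_0 + H\right] + \frac{\partial}{\partial t}(kZ_5) , \tag{15}$$

where

$$\mathbf{V}_0 = \frac{g}{f}\,\hat{\mathbf{k}}\times\nabla Z_0 , \qquad \mathbf{V}_e = \frac{g}{f}\,\hat{\mathbf{k}}\times\nabla Z_e ; \qquad Z_e = \left[kZ_5 + H\right] . \tag{16}$$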
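
The statement that the equivalent advecting wind gives the same
instantaneous advection as the 1000-mb geostrophic wind can be checked
numerically. The sketch below assumes both winds are geostrophic winds
of the fields Z_0 and Z_e on a plane with constant f; the grid, the
synthetic height fields and all names are illustrative assumptions, not
data from this study.

```python
import numpy as np

G = 9.8          # gravity, m s^-2
F = 1.117e-4     # Coriolis parameter near 50N, s^-1 (treated as constant)


def geostrophic_wind(z, dx, dy, g=G, f=F):
    """Geostrophic wind (u, v) of a height field z indexed as [y, x]."""
    dzdy, dzdx = np.gradient(z, dy, dx)
    return -(g / f) * dzdy, (g / f) * dzdx


def advection(u, v, q, dx, dy):
    """V . grad(q) on the same grid."""
    dqdy, dqdx = np.gradient(q, dy, dx)
    return u * dqdx + v * dqdy


def compare_advections(n=40, dx=1.0e5, dy=1.0e5, seed=0):
    """Return V_0 . grad(Z_e - Z_0) and V_e . grad(Z_e - Z_0)."""
    rng = np.random.default_rng(seed)
    y, x = np.mgrid[0:n, 0:n]
    # Synthetic 1000-mb heights and an equivalent field Z_e = k * Z_5.
    z0 = 100.0 * np.sin(2 * np.pi * x / n) * np.sin(2 * np.pi * y / n)
    z0 = z0 + rng.normal(0.0, 1.0, (n, n))
    ze = 0.5 * (5500.0 + 200.0 * np.cos(2 * np.pi * x / n) - 4.0 * y)
    ze = ze + rng.normal(0.0, 1.0, (n, n))
    u0, v0 = geostrophic_wind(z0, dx, dy)
    ue, ve = geostrophic_wind(ze, dx, dy)
    q = ze - z0
    return advection(u0, v0, q, dx, dy), advection(ue, ve, q, dx, dy)


if __name__ == "__main__":
    a0, ae = compare_advections()
    print(np.max(np.abs(a0 - ae)))
```

The two advections agree because each wind blows along the contours of
its own height field, so only the cross term between the two fields
survives in either expression.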





3. Simplification of the dynamic model 

In Reed's model the scalar field of $Z_e$ whose gradient defines $\mathbf{V}_e$
has a term (-M) representative of terrain effects in addition to those
in equation (16). As already noted, in approaching the statistical
application of the dynamic model we have set M=0.

A word of discussion regarding Reed's function G and our function
H is appropriate. From equations (6) and (9), G is expressible as

$$G = \frac{G'}{1 + k'} = \frac{f^2L^2}{8\pi^2 g\,(1 + k')} , \qquad L = \frac{2\pi a\cos\phi}{N} .$$

Reed uses a mean value of k=0.55 or k'=1.22 for wave number N=6
in connection with his 12-hr Lagrangian prediction technique. It may
be shown that

$$G = \frac{\Omega^2 a^2}{2gN^2(1 + k')}\,\sin^2 2\phi$$

and hence G is an increasing function of latitude up to $\phi$ = 45N.
For 45 < $\phi$ < 55 degrees, which is typical of the range of latitude
encountered in this study, G is a slowly decreasing function. The mean
northward gradient of the G-field centered at latitude 50N over a
10-degree span is equivalent to a geostrophic wind of 3 knots in an
eastward direction.
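
The latitude dependence of G can be illustrated with a short
calculation. The sketch assumes
G = Omega^2 a^2 sin^2(2 phi) / [2 g N^2 (1 + k')], which follows from
f = 2 Omega sin(phi) and L = 2 pi a cos(phi) / N, and it converts the
northward gradient of G into an equivalent geostrophic wind with
u = -(g/f) dG/dy. The constants and function names are assumptions made
for the illustration.

```python
import math

OMEGA = 7.292e-5      # earth's rotation rate, s^-1
A_EARTH = 6.371e6     # earth's radius, m
GRAV = 9.8            # gravity, m s^-2
MS_PER_KNOT = 0.514444


def k_from_kprime(kprime):
    """k = k' / (1 + k'), as in equation (13)."""
    return kprime / (1.0 + kprime)


def g_field(lat_deg, n_wave=6, kprime=1.22):
    """Assumed form of G (in metres) as a function of latitude."""
    phi = math.radians(lat_deg)
    scale = OMEGA ** 2 * A_EARTH ** 2 / (2.0 * GRAV * n_wave ** 2 * (1.0 + kprime))
    return scale * math.sin(2.0 * phi) ** 2


def equivalent_zonal_wind_knots(lat_deg, span_deg=10.0):
    """Eastward geostrophic wind equivalent to the mean gradient of G."""
    d_g = g_field(lat_deg + span_deg / 2.0) - g_field(lat_deg - span_deg / 2.0)
    d_y = A_EARTH * math.radians(span_deg)
    f = 2.0 * OMEGA * math.sin(math.radians(lat_deg))
    u = -(GRAV / f) * d_g / d_y          # m/s, positive toward the east
    return u / MS_PER_KNOT


if __name__ == "__main__":
    print(k_from_kprime(1.22))
    print(g_field(40.0), g_field(45.0), g_field(50.0))
    print(equivalent_zonal_wind_knots(50.0))
```

Under these assumptions the equivalent wind at 50N comes out between 2
and 3 knots toward the east, the same order as the value quoted in the
text, and k' = 1.22 reproduces k = 0.55 to two decimals.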

Recall that our function H is given by $G - fUy/[g(1 + k')]$, and using
(1+k')=2.22 it follows that the term involving U is equivalent to a
mean maximum zonal wind (according to Namias and Clapp [1951])
of 2 knots. In lower latitudes the effects of the G and U terms are of
opposite algebraic sign, whereas in the latitude belt of this discussion
the gradient of H is equivalent to an eastward wind of 5 knots.
Consequently the H-field in (16) has been neglected relative to $kZ_5$ in the
$Z_e$ field. With this simplification the prediction equation (15) becomes

$$\frac{\partial Z_0}{\partial t} = \mathbf{V}_e\cdot\nabla h + \frac{\partial}{\partial t}(kZ_5) , \tag{17}$$

where $h = Z_5 - Z_0$ is the 1000-500 mb thickness and use has been made
of $\mathbf{V}_e\cdot\nabla Z_5 = 0$.
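The last term $\partial(kZ_5)/\partial t$ in (17) may be obtained by employing the
graphical prediction technique for the barotropic model as described
in Haltiner and Martin [1957, pp. 395-398]. If $\bar{Z}_5$ represents the
space-mean 500-mb height, the Fjortoft method leads to the result

$$\frac{\partial}{\partial t}(\bar{Z}_5 - Z_5 + J) = -\bar{\mathbf{V}}_{5,J}\cdot\nabla(\bar{Z}_5 - Z_5 + J) \tag{18}$$

at the level of nondivergence (assumed to be 500 mb). Fjortoft's
space-mean advecting wind $\bar{\mathbf{V}}_{5,J}$ is given by

$$\bar{\mathbf{V}}_{5,J} = \frac{g}{f}\,\hat{\mathbf{k}}\times\nabla(\bar{Z}_5 + J) . \tag{19}$$

We therefore have the familiar result

$$\frac{\partial Z_5}{\partial t} = \bar{\mathbf{V}}_{5,J}\cdot\nabla(\bar{Z}_5 - Z_5 + J) + \frac{\partial\bar{Z}_5}{\partial t} . \tag{20}$$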


The Fjortoft graphical treatment makes use of the following function
for J:


S - [aiinmsecoses 
Petterssen [1956, p. 392] gives the distribution of the J-field
based upon a grid mesh d=1000 km. At latitude 50N, the gradient of
J which he shows is equivalent to an easterly geostrophic wind of
8.5 knots. In regions of maximum advection of $(\bar{Z}_5 - Z_5 + J)$, it appears
reasonable on the basis of a statistical approach to approximate the
500-mb advecting space-mean wind $\bar{\mathbf{V}}_{5,J}$ by $\bar{\mathbf{V}}_5$, where $\bar{\mathbf{V}}_5$ is
the space-mean geostrophic wind at 500 mb. Moreover, since our
"stations" are far apart and confined to the 50N latitude circle, the
working hypothesis made in the following statistical analysis is that
$\partial\bar{Z}_5/\partial t$ gives only "feedback" contributions to the term
$\bar{\mathbf{V}}_5\cdot\nabla(\bar{Z}_5 - Z_5 + J)$. For simplicity the 1000-mb height change
resulting from (15) and (18) becomes

$$\frac{\partial Z_0}{\partial t} = \mathbf{V}_e\cdot\nabla h + k\,\bar{\mathbf{V}}_5\cdot\nabla(\bar{Z}_5 - Z_5 + J) . \tag{21}$$


Note that since $\mathbf{V}_e = k\mathbf{V}_5$ it follows that for 24-hr advection it is
appropriate to consider the first term on the right side of (21) as a
reduced advection. Furthermore the vector $k\bar{\mathbf{V}}_5$ in (21) has also
been treated as the reduced advecting wind $\bar{\mathbf{V}}_e$; an interpretation
of this kind has been found useful in "advecting" the movement
of rain areas associated with progressive sea-level cyclones by
Renard [1959]. We have noted that Reed gives the value k=0.55
as appropriate to a Lagrangian forecast technique with 12-hr time
increments. In our computations k was rounded off to 0.5 in view of
the subjectivity of hand measurements of the space-mean geostrophic
wind $\bar{\mathbf{V}}_5$. Hence with these simplifications our prediction model
now becomes

$$\frac{\partial Z_0}{\partial t} = A_0 + A_1\left[-\mathbf{V}_e\cdot\nabla h\right] + A_2\left[-\bar{\mathbf{V}}_e\cdot\nabla(\bar{Z}_5 - Z_5 + J)\right] . \tag{22}$$

Here the regression coefficients $A_0$, $A_1$, $A_2$ are introduced as
unknowns in order to absorb any statistically-determined feedback
relationships contained in the preceding analysis, such as, for example,
the assumption

$$k\,\bar{\mathbf{V}}_{5,J}\cdot\nabla(\bar{Z}_5 - Z_5 + J) = \bar{\mathbf{V}}_e\cdot\nabla(\bar{Z}_5 - Z_5 + J) .$$

In equation (22) the best-fit determination of $A_0$, $A_1$, and $A_2$ will
be made by least squares. In essence, however, this equation presents
a dynamically formulated problem similar to that of Reed [1956] and
Haltiner and Hesse [1958].

4. Computational procedures for the dynamic model

The local change of the 1000-mb height in equation (22) is written for
a 24-hr time step and is considered to be strictly proportional to the
24-hr change of sea-level pressure, $\Delta p_0$, which serves as the dependent
variable. The advecting terms in the working equation (22) are then
used as the independent variables in a multiple linear regression
equation of the form

$$Y = A_0 + A_1X_1 + A_2X_2$$

or

$$\Delta p_0 = A_0 + A_1\left[-\mathbf{V}_e\cdot\nabla h\right] + A_2\left[-\bar{\mathbf{V}}_e\cdot\nabla(\bar{Z}_5 - Z_5 + J)\right] . \tag{23}$$
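
A two-predictor least-squares fit of the form Y = A_0 + A_1 X_1 + A_2 X_2
can be sketched as follows. This is a minimal illustration using NumPy
rather than the BIMD programs; the function name and the synthetic data
are assumptions, and PRV is computed here as 1 - SSE/SST.

```python
import numpy as np


def fit_dynamic_model(x1, x2, y):
    """Least-squares fit of y = A0 + A1*x1 + A2*x2.

    Returns (coefficients, PRV, R), where PRV is the fractional
    reduction of variance and R the multiple correlation coefficient.
    """
    x1, x2, y = (np.asarray(v, dtype=float) for v in (x1, x2, y))
    design = np.column_stack([np.ones_like(x1), x1, x2])
    coeffs, *_ = np.linalg.lstsq(design, y, rcond=None)
    residual = y - design @ coeffs
    sse = float(residual @ residual)               # unexplained sum of squares
    sst = float(np.sum((y - y.mean()) ** 2))       # total sum of squares
    prv = 1.0 - sse / sst
    return coeffs, prv, float(np.sqrt(max(prv, 0.0)))


if __name__ == "__main__":
    # Synthetic example only; these are not the data of this study.
    rng = np.random.default_rng(0)
    thickness_adv = rng.normal(size=450)
    vorticity_adv = rng.normal(size=450)
    dp = 6.0 - 1.2 * thickness_adv - 0.05 * vorticity_adv
    dp = dp + rng.normal(scale=2.0, size=450)
    print(fit_dynamic_model(thickness_adv, vorticity_adv, dp))
```

With noisy data the weaker predictor contributes little to PRV, which
is the kind of behavior examined for the vorticity advection in the
results.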


The advecting terms were evaluated by a quasi-Lagrangian technique
using fixed-point computations at points determined by trajectory
tracing. At each of the 24 stations for 30 days an upwind point was
determined. This was accomplished by specifying a geostrophic wind
from the 500-mb space-mean height field terminating at the station in
question and then tracing the upwind trajectory in the contour channel
for a distance corresponding to 24 hours. In the cases of difluent
and confluent contours, 6-hour steps on the initial chart were used so
that more representative space-mean winds were available at each step.
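
The upwind-point construction can be imitated with a simple backward
trajectory. The sketch below is a stand-in for the hand tracing
described above, not a reproduction of it: it assumes a flat x-y plane,
a wind field frozen at its initial-chart values, simple forward steps of
6 hours, and a reduction factor k applied to the wind. All names are
hypothetical.

```python
def upwind_point(x, y, wind_at, k=0.5, hours=24.0, step_hours=6.0):
    """Trace backward from (x, y) along the reduced wind k * V.

    wind_at(x, y) must return the wind components (u, v) in m/s at the
    given position in metres; the field is held fixed in time.
    """
    steps = int(round(hours / step_hours))
    dt = step_hours * 3600.0
    for _ in range(steps):
        u, v = wind_at(x, y)
        # Move against the reduced wind for one step.
        x -= k * u * dt
        y -= k * v * dt
    return x, y


if __name__ == "__main__":
    # Uniform 20 m/s westerly: the upwind point lies due west.
    print(upwind_point(0.0, 0.0, lambda x, y: (20.0, 0.0)))
```

For a uniform wind the result is independent of the step length; the
6-hour steps only matter where the wind varies along the trajectory, as
in difluent or confluent flow.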

The parameters $\bar{Z}$ and J as used in the vorticity term were computed
using a square grid distance of 782 km. The space-mean 500-mb height
field used to determine the advecting wind was obtained by subjecting
the 500-mb heights to four scans using a "smoother" of the type
2 
(Ai). = + (A;. Ai) 
Aij). ~ 3 pe V Oi T (24) 


where subscripts S and I refer to smoothed and initial values, 
respectively. 

As previously noted, k in $\mathbf{V}_e = k\mathbf{V}_5$ was taken to be 0.5 and the
wind speed in all cases was reduced by this factor. With this choice
of advecting wind, the upwind point thus determined was the same for
the advection computations of both h and $(\bar{Z}_5 - Z_5 + J)$. The
difference of the values of h and $(\bar{Z}_5 - Z_5 + J)$ at the upwind point
from their initial values over the station was recorded as the 24-hr
advective change.

FNWF data tapes served to give printouts of the entire fields
for all 24-hr forecast periods under consideration. Values of $\Delta p_0$
at the stations along 50N were then obtained by bilinear interpolation.
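
Bilinear interpolation from the four surrounding grid points can be
written compactly. The sketch assumes fractional coordinates tx and ty
in [0, 1] measured from the corner f00; the function name and argument
order are illustrative.

```python
def bilinear(f00, f10, f01, f11, tx, ty):
    """Bilinear interpolation inside one grid square.

    f00, f10, f01, f11 are the values at (0, 0), (1, 0), (0, 1) and
    (1, 1); tx and ty are the fractional distances in x and y.
    """
    return (f00 * (1.0 - tx) * (1.0 - ty)
            + f10 * tx * (1.0 - ty)
            + f01 * (1.0 - tx) * ty
            + f11 * tx * ty)


if __name__ == "__main__":
    # Value at the centre of a square with corner values 1, 2, 3, 4.
    print(bilinear(1.0, 2.0, 3.0, 4.0, 0.5, 0.5))
```

The interpolated value reduces to the corner values at the corners and
to linear interpolation along each edge.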





5. Results for the dynamic model 

The BIMD 06 program of the BIMD library was used to perform the
least-squares regression analysis based upon the use of equation (23)
for the dynamic model. This was accomplished for each of the three
data stratifications shown in Table 1. Relevant statistics obtained by
this analysis are displayed in Table 2. Here PRV stands for percent
reduction of variance, while R is the multiple correlation coefficient.

After analyzing the results obtained by stratifying the data, it was
decided that, because of the greater percent reduction in variance given
by data from the fifteen stations over or near the land areas, this
group would be used to develop the final regression equation. The
resulting equation (land areas only) was

$$\Delta p_0 = 6.59 - .04\,(-\bar{\mathbf{V}}_e\cdot\nabla\eta) - 1.18\,(-\mathbf{V}_e\cdot\nabla h) . \tag{25}$$


Note that the regression coefficients are negative, indicating that
advection of higher values of $\eta$ and of h each cause a negative
contribution to the pressure change. This agrees with usual synoptic
observations.

Since the assumption of a linear relationship between the predictors
and the predictand was not necessarily valid, scattergrams of $\Delta p_0$
versus both independent variables were examined in order to investigate
possible indications of a preferred relationship. The BIMD 27 program
adapted to the CDC 1604 computer was employed to plot these
scattergrams. The results using thickness and absolute vorticity advection,
respectively, are shown in figures 1 and 2. Examination of these figures
shows that while a linear regression between $\Delta p_0$ and $-\mathbf{V}_e\cdot\nabla h$
is reasonably valid, the relationship between $\Delta p_0$ and $-\bar{\mathbf{V}}_e\cdot\nabla\eta$
appears to be non-linear with no obvious correlation.

The simple correlation coefficients found between $\Delta p_0$ and $-\mathbf{V}_e\cdot\nabla h$
and $-\bar{\mathbf{V}}_e\cdot\nabla\eta$ were -.41 and -.04, respectively (see also Table 2a).
Note that the percent reduction in variance attributed to the partial
correlation of the vorticity advection is insignificant, indicating that
the thickness-advection parameter will give equally good results when
used alone as a predictor. This was not due to a significant correlation
between the variables $\mathbf{V}_e\cdot\nabla h$ and $\bar{\mathbf{V}}_e\cdot\nabla\eta$, since their simple
correlation coefficient was only 0.17.

The regression equation developed from these parameters was tested 
on an independent sample of 75 cases drawn from data gathered in the 
month of December with the results shown in Table 2b. 

The apparent failure of the 24-hour absolute vorticity
advection, as measured by the method here employed, to furnish
significant predictability for the subsequent 24-hour pressure change
was surprising, particularly in view of the comparative usefulness of
the thickness advection. Some of the primary reasons for these
differences in the statistical behavior of these two predictors were
revealed upon a critical re-examination of the initial-data fields. These
are discussed in subsections (a), (b) and (c), below, with some
resulting conclusions in (d).

* The symbol $\eta$ has been introduced to represent $(\bar{Z}_5 - Z_5 + J)$.

(a) Advection of 500-mb absolute vorticity. The values of this variable
are sensitively dependent upon the field of $\bar{Z}_5 - Z_5 + J$. This field had the
characteristic of exhibiting unusually strong gradients over short
distances near the vorticity centers, with relatively weak gradients
elsewhere. Thus any cumulative error in constructing a 24-hour upwind
trajectory, based on the use of the equivalent-advecting wind $\bar{\mathbf{V}}_e$,
can give rise to sizeable differences in the value of $\bar{Z}_5 - Z_5 + J$ to be
advected. Reed [1962] has also referred to the question of trajectory
accuracy.

(b) Advection of thickness. The printed fields of $h = Z_5 - Z_0$ displayed
a smaller degree of non-linearity over short distances and were
subject to considerably smaller upwind-point error. The difference in
appearance of the $\bar{Z}_5 - Z_5 + J$ and $Z_5 - Z_0$ fields may be attributable to
the smoothing process used in obtaining $\bar{Z}_5$, whereas no smoothing was
employed in obtaining the thickness field.


(c) Approximation $\bar{\mathbf{V}}_e = k\bar{\mathbf{V}}_5$ for advection of absolute vorticity.
In a small percentage of cases of 24-hour 500-mb advection the value
of $k\,[\bar{\mathbf{V}}_{5,J}\cdot\nabla(\bar{Z}_5 - Z_5 + J)]$ was not equal to that using the
advecting wind $\bar{\mathbf{V}}_e = k\bar{\mathbf{V}}_5$. This occurred when a 24-hour
trajectory passed over an extreme value of $(\bar{Z}_5 - Z_5 + J)$.

(d) It must be concluded that the statistical Lagrangian technique
employed here suffers from the defect of employing excessive time
steps. While the procedure bears some similarity to the Fjortoft
technique, it does not have the advantage of scanning comparative data
from all latitudes within the grid map. Hence the smoothing capability
usually available in most prognostic procedures (and in analysis in
general) could not be used as a prognostic aid here.





6. A multivariate linear regression analysis for 24-hour prediction 
of sea-level pressure 

In this phase of the investigation a total of 28 independent variables
were tested as possible predictors in a purely statistical approach.
Recent investigations by Ostby-Viegas [1960], Miller [1962], and
others employing statistical reduction methods have utilized the
advantages of a computer to reduce large numbers of possible predictors
to a significant few. Such methods may use either objectively-determined
data with no immediate rationale for the relationship, or
dynamically-based data to select variables. Whatever significant multiple
linear regressions exist are found by a statistical screening process.
Once an objectively chosen variable has been selected through the
screening process it is usually possible to find a causal relationship
between it and the predictand on the basis of synoptic-dynamic
considerations.

In this section, the intent was to utilize the two predictors already
employed in the dynamic model (see equation (23)). Since the predictors
already chosen rest heavily on 500-mb data, it was decided to
choose, as far as possible, additional objective parameters from this
level. Furthermore, since the month of November 1962 was
characterized by contrasting mid-latitude regimes in the Pacific and Atlantic
Oceans, with higher than normal mid-latitude zonal flow in the Pacific
and anomalous blocking action in the Atlantic, it was felt that the
24-hour values of the dynamic (advective) parameters might contain
considerable scatter at stations influenced by such contrasting
meteorological activity. Consequently it was decided to select a limited number
of 500-mb parameters indicative of the long-wave features and, to some
degree, of the extended-period circulation anomalies. Recourse is
made here to some of the extended-forecast concepts of Namias [1951].

At each station the latest 500-mb height Z and those for each of the
three preceding 24-hour map times were read off or interpolated from
the contoured printout charts. From these data the two sets of
parameters, the two-day trend $T_i$ and three-day mean $\bar{Z}_i$, were computed
for each station, and a regression equation of the form

$$\Delta p_{0i} = B_{i0} + B_{i1}(-\bar{\mathbf{V}}_e\cdot\nabla\eta)_i + B_{i2}(-\mathbf{V}_e\cdot\nabla h)_i + \sum_j\left[a_{i-2j}\,T_{i-2j} + c_{i-2j}\,\bar{Z}_{i-2j}\right] + D\,u_i + E\,v_i \tag{26}$$

is sought, where j is a 15° longitude interval, considered positive
eastward. The subscript i-2j indicates that we are examining data
at all upwind points 30° of longitude apart. Note that the two-day trend
at station i is given by


im = (27) 


and $\bar{Z}_i$ for simplicity has been defined here as the simple arithmetic
mean


Ly. a ae Le Z 2 ) 14, (28) 







The implication of the summation sign before the $T_{i-2j}$ and $\bar{Z}_{i-2j}$
terms is that we are considering all of the large-scale upwind and
downwind rates of change and mean heights in contributing predictability
to $\Delta p_0$ at the site denoted by the subscript i.
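
The bookkeeping behind the summation can be sketched as follows. The
example assumes stations numbered 1 to 24 eastward, a wrap-around of the
index around the latitude circle, and one trend and one mean-height
predictor at every second station; the function names are hypothetical.

```python
def upwind_station_numbers(i, n_stations=24, stride=2):
    """Station numbers i, i-2, i-4, ... taken around the full circle.

    Stations are numbered 1..n_stations eastward, so decreasing the
    index moves westward (upwind) in steps of 30 degrees of longitude.
    """
    count = n_stations // stride
    return [((i - 1 - stride * j) % n_stations) + 1 for j in range(count)]


def predictor_count(n_stations=24, stride=2):
    """Two dynamic predictors, trends, mean heights, and two 850-mb winds."""
    n_points = n_stations // stride
    return 2 + n_points + n_points + 2


if __name__ == "__main__":
    print(upwind_station_numbers(1))
    print(predictor_count())
```

Counted this way, twelve trends and twelve mean heights, together with
the two advective terms and the two 850-mb wind components, account for
the 28 candidate predictors.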

The variables $u_i$ and $v_i$ are the 850-mb zonal and meridional
geostrophic-wind components at station i. They are presented in
equation (26), firstly, in order to include possible relevant low-level
effects, and secondly, in order that a physically important factor such
as $\overline{uv}$ (where the superior bar indicates a zonal mean) may at least
be implicit within the set of independent variables. The significance
of this cross-covariance is that it suggests effects of the zonal-index
cycle (see Haltiner and Martin [1957, pp. 446-448]).

With 28 possible predictors appearing in (26), the BIMD 09 program
was used to perform the regression analysis for the statistical model.
This program is a modification of one originally written by M. A.
Efroymson [1955] of the Esso Research and Engineering Company.
The important features of the program include a stepwise screening
of the predictors using arbitrary upper and lower critical F-values
as cutoff limits for inclusion or rejection of variables.
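
A stepwise screening of this general kind can be sketched as a forward
selection with a fixed F-to-enter threshold. This is not the BIMD 09
program; it is a minimal illustration that assumes a lower (removal)
F-level of zero, so that a variable once selected is never rejected.
The function names and the synthetic data are assumptions.

```python
import numpy as np


def _sse(design, y):
    """Residual sum of squares of the least-squares fit of y on design."""
    coeffs, *_ = np.linalg.lstsq(design, y, rcond=None)
    residual = y - design @ coeffs
    return float(residual @ residual)


def forward_screen(X, y, f_enter=10.0):
    """Forward selection: add the predictor with the largest partial F.

    Stops when no remaining predictor reaches f_enter. Returns the
    column indices in the order selected.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, p = X.shape
    ones = np.ones((n, 1))
    selected = []
    sse_current = _sse(ones, y)
    while len(selected) < p:
        best = None
        for j in range(p):
            if j in selected:
                continue
            design = np.hstack([ones, X[:, selected + [j]]])
            sse_new = _sse(design, y)
            dof = n - (len(selected) + 1) - 1   # residual degrees of freedom
            if dof <= 0 or sse_new <= 0.0:
                continue
            f_value = (sse_current - sse_new) / (sse_new / dof)
            if best is None or f_value > best[0]:
                best = (f_value, j, sse_new)
        if best is None or best[0] < f_enter:
            break
        selected.append(best[1])
        sse_current = best[2]
    return selected


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(300, 6))
    y = 3.0 * X[:, 0] - 2.0 * X[:, 3] + 0.5 * rng.normal(size=300)
    print(forward_screen(X, y, f_enter=10.0))
```

Lowering f_enter admits more predictors, each adding a smaller
reduction of variance, which is the trade-off examined below with
F-levels of 10.0 and 5.0.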

The F statistic as employed in this test is the ratio of the mean
squares explained by the regression to the residual or unexplained
mean squares. According to Anderson [1960] the F-ratio is the ratio
of two chi-square distributed variables with k and n-k-1 degrees of
freedom, where n is the number of cases in the data sample. Miller
[1962] suggests values for the critical or cutoff F-values for introducing
predictors into (26). If P is the total number of possible predictors
(28 in this investigation) and k is the number of predictors already
selected, his critical F-value is given as


F et ae = i (29) 
P_fe+| 
where $\alpha^* = \alpha/(P - k + 1)$. The value $\alpha$ = .05 is the usual critical
significance level in the selection test. It is apparent that the level
imposed by this method of determining a critical F-value will decrease
as more predictors are chosen.

Inasmuch as the BIMD 09 screening program uses a fixed F-level
throughout the screening process, it seemed desirable to perform several
regression analyses of the data with differing F-levels but retaining one
coinciding with Miller's (F = 10.0). In a recent analysis (Martin et al.
[1963]) the recommendation is made that a lower F-level for rejection of
a previously selected variable should be taken as zero when using the
BIMD 09 program with Miller's selection criterion. Accordingly,
predictors were selected in this analysis with arbitrarily chosen upper
critical F-levels of 10.0 and 5.0, and a lower limit in each case of zero.

Selected parameters are shown in Table 3 in the order in which they
were chosen by the analysis of the dependent data. The form of the
F-test used to compute the significance of the final regression equation
is given (after Anderson [1960, p. 89]) as below:

$$F(k,\,n-k-1) = \left(\frac{R^2}{1 - R^2}\right)\left(\frac{n - k - 1}{k}\right) , \tag{30}$$

where R is the multiple correlation coefficient. The percent reduction
in variance, $R^2$, is given by the formula

$$R^2 = 1 - \left(\frac{S}{\sigma}\right)^2 , \tag{31}$$

where S, the reduced standard error, and $\sigma$, the total standard
error, are available at each step of the BIMD 09 program printout.
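
The two formulas can be evaluated directly. The sketch assumes the
standard forms F = [R^2/(1 - R^2)] [(n - k - 1)/k] and
R^2 = 1 - (S/sigma)^2; the function names are illustrative, and the
sample values in the usage lines are the rounded figures quoted for the
statistical model.

```python
def f_statistic(r_squared, n_cases, n_predictors):
    """F(k, n-k-1) for a regression with k predictors and n cases."""
    dof = n_cases - n_predictors - 1
    return (r_squared / (1.0 - r_squared)) * (dof / n_predictors)


def reduction_of_variance(reduced_std_error, total_std_error):
    """Fractional reduction of variance, 1 - (S / sigma)^2."""
    return 1.0 - (reduced_std_error / total_std_error) ** 2


if __name__ == "__main__":
    # Six predictors, 450 cases, PRV of 22.04 percent.
    print(f_statistic(0.2204, 450, 6))
    # Illustrative standard errors only.
    print(reduction_of_variance(3.0, 5.0))
```

With six predictors, 450 cases and a PRV of 22.04 percent, the first
call returns a value close to the F(6, 443) listed in Table 3.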
The cumulative percent reduction in variance was computed for
each step of the regression (see Table 3). Note that using Miller's
suggested initial F-level of 10.0 only two parameters are chosen by the
screening process. These significant predictors are $-\mathbf{V}_e\cdot\nabla h$,
the advection of thickness as employed in the dynamic model, and $T_i$,
the 2-day 500-mb height change over the station. Together, these
parameters give a cumulative percent reduction in variance of 18.43.
When the F-level is lowered to 5.0 four additional parameters are
selected, each of which contributes a relatively small gain in PRV.
The four additional parameters selected (in the order chosen) were
$-\bar{\mathbf{V}}_e\cdot\nabla\eta$, or the advection of $\bar{Z}_5 - Z_5 + J$ as employed in the dynamic
model; $u_i$, the zonal geostrophic component of the 850-mb wind over the
station; $T_{i-6}$, the 2-day 500-mb height trend 90° upstream;
and $\bar{Z}_{i-10}$, the 3-day mean 500-mb height 150° upstream. The
resulting regression equation which includes the six selected variables
(see Table 3) is

$$\Delta p_0 = 116.1 - 1.16\,(-\mathbf{V}_e\cdot\nabla h) - .18\,(-\bar{\mathbf{V}}_e\cdot\nabla\eta) - .28\,T_i - .54\,u_i + [\text{terms in } T_{i-6} \text{ and } \bar{Z}_{i-10}] . \tag{32}$$

While the computed F-levels indicate that these last predictors are
statistically significant, it is possible that such correlations as are
indicated arise from "noise" and/or erroneous data. Under such
circumstances the regression equation may tend to "overfit" the sample. A
discussion of this effect is given by Panofsky and Brier [1958, p. 176].

The existence of "overfitting" of the dependent sample is demonstrated
by the instability of the regression equation when applied to the independent
data. This phenomenon may evidence itself by a substantial decrease
("shrinkage") in the percent reduction of variance explained with the
independent sample. For example, when equation (32) was tested on an
independent sample of 75 cases, a PRV of 10.6 percent occurred, compared
to 22.04 percent for the dependent sample. Although the "shrinkage" here
is large, some stability of the final equation is indicated and an
improvement shown over the dynamic model.
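
The shrinkage test can be expressed as a small calculation. The sketch
assumes that PRV on the independent sample is measured against that
sample's own mean, an assumption not stated in the text; the function
names are hypothetical.

```python
import numpy as np


def percent_reduction_of_variance(observed, predicted):
    """PRV (percent) of predictions against observations.

    The reference variance is taken about the mean of the observed
    sample itself (an assumption for this illustration).
    """
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    sse = float(np.sum((observed - predicted) ** 2))
    sst = float(np.sum((observed - observed.mean()) ** 2))
    return 100.0 * (1.0 - sse / sst)


def shrinkage(prv_dependent, prv_independent):
    """Loss of PRV (percentage points) on passing to independent data."""
    return prv_dependent - prv_independent


if __name__ == "__main__":
    # PRV values quoted in the text for equation (32).
    print(shrinkage(22.04, 10.6))
```

A regression equation is applied to the independent sample with its
coefficients held fixed, so a PRV well below the dependent-sample value
indicates that part of the original fit reflected noise.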

From practical considerations we wish to use only the most efficient
predictors. Those which offer little improvement in predictability (by
the added percent reduction in variance criterion) are chiefly of
theoretical interest. Undoubtedly the most practical prediction
equation giving the least overfit would involve the two parameters
selected in the statistical model with F ≥ 10, namely, advection of
thickness $-\mathbf{V}_e\cdot\nabla h$ and the 2-day height difference $T_i$. The
usefulness of those predictors with lower F-levels is doubtful, especially
when the small percent reduction in variance achieved by their use is
considered.

Errors in data sampling have already been referred to in section 5,
particularly in reference to the dynamic predictors. Other errors are
map-scale error and interpolation errors, both of which are considered
to be negligible in this instance.

7. Conclusions

The prediction equations developed by this investigation do not offer
a significant improvement over existing methods. This is felt to be due
partly to the limitations inherent in applying the statistical technique as
a linear operator and partly to the fact that single-point
observations fail to capture gradient effects in the manner of a closely
spaced grid. Since baroclinic development can normally be expected at
the chosen latitude and season, any useful
filter should predict non-linear effects in a consistent fashion. The
results obtained by applying the developed regression equations to the
independent samples indicate that this is not being done and that we
must recognize some of the shortcomings of the measurement methods.
The most immediate improvements suggested from the results are:
(1) decreasing the Lagrangian time step; and (2) employing an entire
grid map to verify actual $\Delta p_0$ patterns.

Table 1. Stratification of data for analysis of the dynamic model

Grouping                Number of stations    Population size
                        in sample
Individual stations             1                    30
Continental stations           15                   450
Ocean stations                  9                   270
Total of stations              24                   720

Table 2. Summary of pertinent statistics for various sample
stratifications for the dynamic model

(a) Dependent data

Grouping          Sample   Variable                     PRV     R      Final F
                  size                                                 value
Total stations     720     $-\bar{V}_e\cdot\nabla\eta$   .002
                           $-V_e\cdot\nabla h$           .144
                           combined                      .146   .382   61.31
Land stations      450     $-\bar{V}_e\cdot\nabla\eta$   .002
                           $-V_e\cdot\nabla h$           .179
                           combined                      .181   .426   49.64
Ocean stations     270     $-\bar{V}_e\cdot\nabla\eta$   .004
                           $-V_e\cdot\nabla h$           .130
                           combined                      .134   .366   20.62

(b) Independent data

Grouping          Sample   Variable                     PRV     R      Final F
                  size                                                 value
Land stations       75     combined                      .01    .10    < 1

Table 3. Summary of results obtained from dependent data for the
statistical model

Predictor                       F level     Percent         Coefficient
                                on entry    reduction in    in regression
                                            variance        equation
                                                            (F ≥ 5.0)
$-V_e\cdot\nabla h$               90.6        16.65           -1.16
$T_i$                             10.8         1.78            -.28
$-\bar{V}_e\cdot\nabla\eta$        6.9         1.06            -.18
$u_i$                              …            …              -.54
$T_{i-6}$ (90° upstream)           …            …               …
$\bar{Z}_{i-10}$ (150° upstream)   5.0          …               …
Constant term                                                 116.1

Standard deviation of $\Delta p_0$: $\sigma$ = 7.484 mb
R² = 0.22
R = 0.48
F(6, 443) = 20.9
F(.99) = 2.85